Supporting Incremental Join Queries on Ranked Inputs

Authors

  • Apostol Natsev
  • Yuan-Chi Chang
  • John R. Smith
  • Chung-Sheng Li
  • Jeffrey Scott Vitter

Abstract

This paper investigates the problem of incremental joins of multiple ranked data sets when the join condition is a list of arbitrary user-defined predicates on the input tuples. This problem arises in many important applications that deal with ordered inputs and multiple ranked data sets and that require the top k solutions. We use multimedia applications as the motivating examples, but the problem is equally applicable to traditional database applications involving optimal resource allocation, scheduling, decision making, ranking, etc. We propose an algorithm J* that enables querying of ordered data sets by imposing arbitrary user-defined join predicates. The basic version of the algorithm does not use any random access, but a J*_PA variation can exploit available indexes for efficient random access based on the join predicates. A special case includes the join scenario considered by Fagin [1] for joins based on identical keys, and in that case our algorithms perform as efficiently as Fagin's. Our main contribution, however, is the generalization to join scenarios that were previously unsupported, including cases where random access in the algorithm is not possible due to lack of unique keys. In addition, J* can support multiple join levels, or nested join hierarchies, which are the norm for modeling multimedia data. We also give ε-approximation versions of both of the above algorithms. Finally, we give strong optimality results for some of the proposed algorithms, and we study their performance empirically.
Proceedings of the 27th VLDB Conference, Roma, Italy, 2001
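
The problem setting in the abstract (top-k join results from ranked inputs, an arbitrary join predicate, sorted access only) can be illustrated with a minimal sketch. This is not the paper's J* algorithm; it is a generic best-first enumeration over two inputs, and it assumes that the combined score is the sum of the two input scores. The names `ranked_join`, `same_suffix`, and the sample data are hypothetical.

```python
import heapq
from itertools import islice
from typing import Callable, Iterator, List, Tuple

Item = Tuple[str, float]  # (payload, score); each input is sorted by score, descending


def ranked_join(
    left: List[Item],
    right: List[Item],
    predicate: Callable[[str, str], bool],
) -> Iterator[Tuple[str, str, float]]:
    """Lazily yield matching pairs in non-increasing order of summed score.

    Assumption (not from the paper): the aggregate score is a plain sum.
    """
    if not left or not right:
        return
    # Max-heap via negated scores; entries are (-score, i, j).
    frontier = [(-(left[0][1] + right[0][1]), 0, 0)]
    seen = {(0, 0)}
    while frontier:
        neg_score, i, j = heapq.heappop(frontier)
        # The predicate is arbitrary; no key lookup (random access) is needed.
        if predicate(left[i][0], right[j][0]):
            yield left[i][0], right[j][0], -neg_score
        # Successors cannot score higher, because both inputs are sorted.
        for ni, nj in ((i + 1, j), (i, j + 1)):
            if ni < len(left) and nj < len(right) and (ni, nj) not in seen:
                seen.add((ni, nj))
                heapq.heappush(frontier, (-(left[ni][1] + right[nj][1]), ni, nj))


def same_suffix(a: str, b: str) -> bool:
    """Hypothetical user-defined join predicate: payloads share a last character."""
    return a[-1] == b[-1]


left_input = [("a1", 1.0), ("a2", 0.5)]
right_input = [("b2", 0.75), ("b1", 0.5)]
# Incremental use: take only as many results as needed.
top2 = list(islice(ranked_join(left_input, right_input, same_suffix), 2))
```

Because the function is a generator, a caller can stop after k results without the remaining combinations being examined, which is the incremental behavior the abstract describes. The sketch handles only two inputs and a sum; nested join hierarchies and the approximation variants are outside its scope.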

Related resources

Sum-Max Monotonic Ranked Joins for Evaluating Top-K Twig Queries on Weighted Data Graphs

In many applications, the underlying data (the web, an XML document, or a relational database) can be seen as a graph. These graphs may be enriched with weights, associated with the nodes and edges of the graph, denoting application specific desirability/penalty assessments, such as popularity, trust, or cost. A particular challenge when considering such weights in query processing is that resu...

Adaptive and Incremental Processing for Distance Join Queries

A spatial distance join is a relatively new type of operation introduced for spatial and multimedia database applications. Additional requirements for ranking and stopping cardinality are often combined with the spatial distance join in on-line query processing or internet search environments. These requirements pose new challenges as well as opportunities for more efficient processing of spati...

Maximizing the Output Rate of Multi-Way Join Queries over Streaming Information Sources

Recently there has been a growing interest in join query evaluation for scenarios in which inputs arrive at highly variable and unpredictable rates. In such scenarios, the focus shifts from completing the computation as soon as possible to producing a prefix of the output as soon as possible. To handle this shift in focus, most solutions to date rely upon some combination of streaming binary op...

Reverse Engineering Top-k Join Queries

Ranked lists have become a fundamental tool to represent the most important items taken from a large collection of data. Search engines, sports leagues and e-commerce platforms present their results, most successful teams and most popular items in a concise and structured way by making use of ranked lists. This paper introduces the PALEO-J framework which is able to reconstruct top-k database q...

Rank-aware, Approximate Query Processing on the Semantic Web

The amount of data on the WWW that adheres to Semantic Web standards is rapidly increasing. Search over this huge Web data corpus frequently leads to queries having large result sets. So, in order to discover data elements, which satisfy a given information need, users must rely on ranking techniques to sort results according to their relevance. Unfortunately, processing queries with ranked res...

Publication date: 2001